chrisberkhout
Contributor

Summary

Adds a copy_from option to the Append processor, matching the option in the Set processor.

This makes it possible to append the value of an existing field even when that value is not a String.

Discussion

The alternative is to use a Script processor, which requires Painless code to manually initialize the destination field and to avoid duplicates if necessary.
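
To make the comparison concrete, here is a minimal sketch of the two approaches. The field names src_port and ports are hypothetical, chosen only to show a non-String value, and the script assumes ports is either missing or already a list. The Script processor workaround might look roughly like this:

```json
{
  "script": {
    "lang": "painless",
    "description": "Hypothetical workaround: append src_port to ports without duplicates",
    "source": "if (ctx.src_port != null) { if (ctx.ports == null) { ctx.ports = new ArrayList(); } if (!ctx.ports.contains(ctx.src_port)) { ctx.ports.add(ctx.src_port); } }"
  }
}
```

With the option proposed here, the same intent would be expressed as:

```json
{
  "append": {
    "description": "Hypothetical field names; copies the existing value of src_port",
    "field": "ports",
    "copy_from": "src_port",
    "allow_duplicates": false
  }
}
```

The two are not guaranteed to behave identically in edge cases such as a missing source field; the sketch is only meant to show how much manual handling the script version needs.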

One variation on this use case is here and in the following 7 script processors.

I'd be happy to add an ignore_empty_value option as well, which would make it easier to collect values that may exist in multiple locations into one destination.


  • Have you signed the contributor license agreement? ✅
  • Have you followed the contributor guidelines? ✅
  • If submitting code, have you built your formula locally prior to submission with gradle check? ✅
  • If submitting code, is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed. ✅
  • If submitting code, have you checked that your submission is for an OS and architecture that we support? ✅
  • If you are submitting this code for a class then read our policy for that.

@chrisberkhout added the >enhancement, :Data Management/Ingest Node, and Team:Data Management labels Jul 28, 2025
@elasticsearchmachine added the v9.2.0 and external-contributor labels Jul 28, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Collaborator

Hi @chrisberkhout, I've created a changelog YAML for you.

Contributor

github-actions bot commented Jul 28, 2025

🔍 Preview links for changed docs

@chrisberkhout force-pushed the append-processor-copy-from branch from d41a3fe to 89568f5 July 28, 2025 12:00
Member

@andrewkroh andrewkroh left a comment

Happy to have this feature to avoid templating.

This change will also need an accompanying modification to the elasticsearch-specification to add the new parameter at https://github.com/elastic/elasticsearch-specification/blob/e585438d116b00ff34643179e6286e402c0bcaaf/specification/ingest/_types/Processors.ts#L329-L344

@chrisberkhout
Contributor Author

chrisberkhout commented Jul 29, 2025

> This change will also need an accompanying modification to the elasticsearch-specification to add the new parameter at https://github.com/elastic/elasticsearch-specification/blob/e585438d116b00ff34643179e6286e402c0bcaaf/specification/ingest/_types/Processors.ts#L329-L344

Thanks. I opened a draft PR for it: elastic/elasticsearch-specification#5056

@chrisberkhout force-pushed the append-processor-copy-from branch 2 times, most recently from 062a5c4 to 726b6b9 July 30, 2025 04:46
@joegallo self-assigned this Sep 2, 2025
@joegallo self-requested a review September 2, 2025 17:17
@elasticsearchmachine
Collaborator

Hi @chrisberkhout, I've created a changelog YAML for you.

Contributor

github-actions bot commented Sep 2, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@joegallo
Contributor

joegallo commented Sep 2, 2025

> I'd be happy to add an ignore_empty_value option as well, which would make it easier to collect values that may exist in multiple locations into one destination.

There's a PR up for that one already (which your comment reminded me of). My intention is to get this one merged and then return to that one so that they're both in for 9.2.0.

chrisberkhout and others added 9 commits September 3, 2025 11:34
@chrisberkhout force-pushed the append-processor-copy-from branch from 2e74f27 to 76aae75 September 3, 2025 09:34
@joegallo
Contributor

joegallo commented Sep 3, 2025

Please don't force push on PRs that have already gotten review traction.

I'm sure you didn't change anything substantial in those commits when you force pushed them, but you could have, so arguably I have to review it all from scratch now. If I were going to sneak something into a codebase, I'd do it by working with somebody for a while on a review, and then force pushing something interesting back into one of the early commits...

@chrisberkhout
Contributor Author

@joegallo Yeah, sorry about that. I used GitHub's "Update with rebase" button, so there were no manual changes, but I get your point.

@joegallo
Contributor

joegallo commented Sep 3, 2025

No worries! It's a small thing.

@joegallo
Contributor

joegallo commented Sep 4, 2025

@chrisberkhout I added some REST tests (this is a detail you shouldn't have to care about). Is there anything else you meant to do on this PR, or are you okay with me adding a ✅ and then merging it (once CI is green)?

@joegallo requested a review from andrewkroh September 4, 2025 18:29
@joegallo
Contributor

joegallo commented Sep 4, 2025

@andrewkroh is there anything else you'd like to see here? I'm imagining @chrisberkhout will follow up on elastic/elasticsearch-specification#5056 separately (and that that has been stalled waiting on this to be merged, which has been stalled waiting on me to review).

  | `allow_duplicates` | no | true | If `false`, the processor does not appendvalues already present in the field. |
- | `media_type` | no | `application/json` | The media type for encoding `value`. Applies only when `value` is a[template snippet](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#template-snippets). Must be one of `application/json`, `text/plain`, or`application/x-www-form-urlencoded`. |
+ | `media_type` | no | `application/json` | The media type for encoding `value`. Applies only when `value` is a [template snippet](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#template-snippets). Must be one of `application/json`, `text/plain`, or`application/x-www-form-urlencoded`. |
  | `description` | no | - | Description of the processor. Useful for describing the purpose of the processor or its configuration. |
  | `if` | no | - | Conditionally execute the processor. See [Conditionally run a processor](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#conditionally-run-processor). |
  | `ignore_failure` | no | `false` | Ignore failures for the processor. See [Handling pipeline failures](docs-content://manage-data/ingest/transform-enrich/ingest-pipelines.md#handling-pipeline-failures). |
Member

Consider adding an example below to encourage copy_from usage. Since there is no ignore_empty_value option yet, maybe pair it with an if condition to demonstrate how to accomplish the same thing:

{
  "append": {
    "if": "ctx.host?.name instanceof String && !ctx.host.name.isEmpty()",
    "field": "related.hosts",
    "copy_from": "host.name",
    "allow_duplicates": false
  }
}
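
One way to try the suggested processor is the simulate pipeline API (POST _ingest/pipeline/_simulate), assuming a cluster that already includes copy_from for the Append processor. The request body below is only a sketch; the three sample documents are made up to cover a populated, an empty, and a missing host.name.

```json
{
  "pipeline": {
    "processors": [
      {
        "append": {
          "if": "ctx.host?.name instanceof String && !ctx.host.name.isEmpty()",
          "field": "related.hosts",
          "copy_from": "host.name",
          "allow_duplicates": false
        }
      }
    ]
  },
  "docs": [
    { "_source": { "host": { "name": "web-01" } } },
    { "_source": { "host": { "name": "" } } },
    { "_source": { "message": "no host here" } }
  ]
}
```

Only the first document satisfies the if condition, so only it should reach the append processor; the other two are skipped by the condition, which is the behavior an ignore_empty_value option would otherwise provide.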

Contributor

My next task is to return to an old PR that adds the ability to ignore empty values. Rather than re-rolling CI on this one, I'm going to merge it as is. Treat this comment as a commitment device: I will come back here with a link to the newly added docs once I add them.

@joegallo merged commit 5f7a2ed into elastic:main Sep 4, 2025
33 checks passed